Learning Objectives

After completing this lesson, you’ll be able to:

Resources

Exercise

Your manager just assigned you to take over a project from your colleague, and they passed their workspace on to you. This project is to calculate the "walkability" of each address in the city of Vancouver. Walkability measures how easy it is to access local facilities on foot. The workspace will measure the distance to the nearest park, the amount of crime in an area, and other similar metrics.

The workspace currently assesses crime, parks, and noise-control areas, but it doesn't give an overall measure of walkability.

Let's build on their workspace and use our debugging skills to address any problems we encounter.

1) View Starting Workspace

Start FME Workbench (2024.0 or later) and open the starting workspace. Then, run the workspace to cache the data.

First, let's figure out what this workspace does:

Sections of starting workspace

  1. Reading Addresses.gdb creates the PostalAddress feature type.
  2. Transformers clean attributes from the PostalAddress feature type and create a separate Number and Street attribute. The transformers replace the number's last two digits with XX to create an attribute that will be the Join Key for joining the crime data.
  3. Reading crime.csv creates the Crime feature type. Substituting XX for the last two digits anonymizes the street number for each crime incident.
  4. The FeatureJoiner joins PostalAddress and Crime based on the Join Key attribute created in 2 and the Block attribute from Crime.
  5. Transformers set the crime Type attribute to a number based on severity and then calculate the total CrimeValue for each address block. The CenterPointReplacer ensures only one point exists if multiple crime incidents occur in the same location.
  6. Reading Parks.tab creates the Parks feature type. This data allows us to measure the walking distance from addresses to parks.
  7. Using the NeighborFinder, the park closest to each address is determined.
  8. The NeighborFinder creates the _distance attribute. The AttributeRenamer renames it to ParkDistance.
  9. The Creator and FeatureReader read the Planning Restrictions OGC Geopackage. That dataset's NoiseControlAreas feature type contains noise restriction area polygons.
  10. The PointOnAreaOverlayer joins the address point data containing the crime, distance to park, and addresses with the NoiseControlAreas polygons. The merged data assigns the noise restrictions to any overlapping points.
  11. The AttributeValueMapper creates the attribute NoiseZoneScore, giving each address a score based on its zone. This new attribute reflects that addresses in noise-restricted areas are more walkable. The ExpressionEvaluator calculates the final Walkability score.
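The join key described in sections 2 to 4 can be illustrated outside FME. The following is a minimal Python sketch, not the workspace's actual transformer logic: the function name block_key, the uppercase normalization, and the handling of numbers with two or fewer digits are all assumptions made for illustration.

```python
def block_key(number: str, street: str) -> str:
    """Build a block-level join key by masking the last two digits."""
    digits = number.strip()
    # Assumption: numbers with two or fewer digits collapse to "XX".
    masked = digits[:-2] + "XX" if len(digits) > 2 else "XX"
    # Assumption: street names are compared in uppercase.
    return f"{masked} {street.strip().upper()}"


# Hypothetical address values, for illustration only.
print(block_key("1234", "Main St"))
```

Two addresses on the same block produce the same key, which is what lets one anonymized crime record match every address on that block.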

2) Inspect the ExpressionEvaluator Transformer

The ExpressionEvaluator transformer creates a measure of walkability that combines the values from crime, park proximity, and noise zones.

Inspect the parameters of the ExpressionEvaluator transformer at the end of the workspace.

It creates a new attribute called Walkability that is:

@Value(ParkDistance) + @Value(CrimeValue) - @Value(NoiseZoneScore)

ExpressionEvaluator expression

With this expression, the smaller the result, the more walkable an address.

3) Assess the Result

Let's assess whether the result of the translation is correct.

Firstly, check the log window for errors and warnings. There are no errors, but there are many warnings, which is not a good sign:

Many warnings

Note

The number of warnings in the Translation Log may differ in your workspace. These numbers can vary based on the Logging Parameters set in FME Options.

Click on the warnings button to filter the log so it shows only the warnings. The warnings say:

Null, missing, or empty string operand was found in expression '@Value(ParkDistance) + @Value(CrimeValue) - @Value(NoiseZoneScore)'.  Result is set to null

Inspect the output cache on the ExpressionEvaluator (you can click the link next to the warning in the log to focus on it). You will find some addresses have a Walkability value of <null>.
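The behavior described by the warning can be mimicked in a few lines. This is a rough Python sketch of null propagation, assuming each feature is a plain dictionary; the function names and sample values are hypothetical and this is not how FME evaluates expressions internally.

```python
from typing import Optional


def is_blank(value) -> bool:
    """True for a null, missing, or empty-string operand."""
    return value is None or value == ""


def walkability(feature: dict) -> Optional[float]:
    """Mirror ParkDistance + CrimeValue - NoiseZoneScore with null propagation."""
    names = ("ParkDistance", "CrimeValue", "NoiseZoneScore")
    # dict.get returns None for an absent key, standing in for <missing>.
    operands = [feature.get(name) for name in names]
    if any(is_blank(value) for value in operands):
        return None  # one bad operand makes the whole result null
    park, crime, noise = (float(value) for value in operands)
    return park + crime - noise


# Hypothetical features: the second one has no CrimeValue attribute.
complete = {"ParkDistance": 120.0, "CrimeValue": 6, "NoiseZoneScore": 10}
incomplete = {"ParkDistance": 120.0, "NoiseZoneScore": 10}
print(walkability(complete), walkability(incomplete))
```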

We know there is a problem. Let's find out where the problem is and why it occurs.

Note

There were no errors, but the workspace's output is still incorrect. Always inspect your workspace's results to ensure you have configured it correctly.

4) Locate the Problem

We can tell the warning comes from the ExpressionEvaluator, but that doesn't necessarily mean that is where the problem lies.

Because we know a null, missing, or empty string is the problem, we can inspect the ExpressionEvaluator cache to find the source of the problem. A practical way to do this is to right-click ParkDistance, CrimeValue, and NoiseZoneScore in turn in the Table View window and sort each one in ascending numeric order. This sorting puts any null or missing values at the top of the table.

Doing this will reveal that CrimeValue has <missing> values. So, the calculation in the ExpressionEvaluator fails because the middle value is <missing>. Let's find out why some of these features have missing CrimeValue values.
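The same check can be expressed as a count of blank operands per attribute. The sketch below is only an illustration in Python, assuming features are dictionaries; count_blank and the sample data are hypothetical.

```python
def count_blank(features, names):
    """Count null, missing, or empty values for each named attribute."""
    counts = {name: 0 for name in names}
    for feature in features:
        for name in names:
            # An absent key comes back as None, standing in for <missing>.
            if feature.get(name) in (None, ""):
                counts[name] += 1
    return counts


# Hypothetical sample: one feature lacks CrimeValue entirely.
sample = [
    {"ParkDistance": 80.0, "CrimeValue": 4, "NoiseZoneScore": 10},
    {"ParkDistance": 310.5, "NoiseZoneScore": 0},
    {"ParkDistance": 45.2, "CrimeValue": 0, "NoiseZoneScore": 10},
]
print(count_blank(sample, ["ParkDistance", "CrimeValue", "NoiseZoneScore"]))
```

Note that a CrimeValue of 0 is a real value and is not counted as blank; only the feature without the attribute is.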

Inspect the FeatureJoiner caches because that's where we first get our Crime data:

Viewing feature caches on the FeatureJoiner

The FeatureJoiner does not have missing values, so let's move further along the workspace. Check the cache for the AttributeValueMapper. This transformer sets values, so perhaps missing values are coming from it.

If you inspect the AttributeValueMapper cache, you'll see no missing values for the CrimeValue or the crime Type attribute. There are also no missing values in the Aggregator and CenterPointReplacer caches.

What about the 3,698 features that do not have a crime; what CrimeValue do they get? Inspect the UnjoinedLeft output from the FeatureJoiner, and you will see that they do not have the CrimeValue attribute. That's why the ExpressionEvaluator says that there are missing values. These features do not have a CrimeValue because they don't enter the AttributeValueMapper, which assigns a value to CrimeValue.

You can confirm this issue by inspecting the NeighborFinder's MatchedBase cache, which contains addresses with and without crime values. You can sort CrimeValue and see that it has missing values here:

Missing values for CrimeValue

Note

Due to a bug, if you are using FME 2025.0, you may not see the City, Province, and other Base feature attributes. You can check Attribute Accumulation > Merge Attributes at the bottom of the NeighborFinder parameters to ensure the transformer exposes them.

5) Fix the Problem

The problem is that those features have no CrimeValue attribute, so we should give them one. To do so, add an AttributeCreator transformer to the workspace between the FeatureJoiner's UnjoinedLeft output port and the NeighborFinder's Base input port:

Adding an AttributeCreator

Open its parameters and create an attribute called CrimeValue with a value of zero (0).

Adding the CrimeValue attribute with a value of 0

Run the workspace, which will run from the AttributeCreator to the ExpressionEvaluator. You should now see fewer warnings, and the Walkability attribute should contain no <null> values. Take note of the rounded max value of Walkability: 956.
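As a rough illustration of why this fix works, the Python sketch below gives every unjoined address a CrimeValue of 0 before the score is calculated. It assumes features are dictionaries; add_crime_value and the sample values are hypothetical, and the real work is done by the AttributeCreator in the workspace.

```python
def add_crime_value(unjoined_left, value=0):
    """Return copies of the unjoined features with CrimeValue set."""
    return [{**feature, "CrimeValue": value} for feature in unjoined_left]


# Hypothetical address that matched no crime record.
unjoined = [{"ParkDistance": 310.5, "NoiseZoneScore": 0}]
fixed = add_crime_value(unjoined)

# With all three operands present, the expression yields a number.
score = fixed[0]["ParkDistance"] + fixed[0]["CrimeValue"] - fixed[0]["NoiseZoneScore"]
print(score)
```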

6) Add Swimming Pools

The city has decided that parks are not a great candidate for walkability scores because there is usually a park nearby. They decided instead to include the walking distance to the nearest swimming pool.

With just a few minor updates, we can reuse the same workflow for swimming pools that we used for parks.

First, let's add a new reader with the following parameters:

Reader Format

OpenStreetMap (OSM) XML

Reader Dataset

https://s3.amazonaws.com/FMEData/FMEData/Data/OpenStreetMap/leisure.osm or C:\FMEData\Data\OpenStreetMap\leisure.osm

When prompted, select only the leisure feature type:

Selecting the leisure feature type

Then, move the new leisure reader near the Parks reader and connect it to the NeighborFinder's Candidate input port. Then right-click on the Parks reader and select Disable.

7) Filter Leisure Data

If you inspect the leisure data, you'll notice various leisure facility types, with the type recorded in the leisure attribute.

So, add a Tester transformer between the leisure reader and the NeighborFinder. Set up the parameters to test for leisure = swimming_pool:

Adding and configuring a Tester to filter to only swimming pools
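The test itself is a simple equality check on one attribute. Here is a small Python sketch of the idea, assuming features are dictionaries; split_by_leisure and the sample records are hypothetical, and the two returned lists only loosely correspond to a Tester's Passed and Failed outputs.

```python
def split_by_leisure(features, wanted="swimming_pool"):
    """Split features on whether their leisure attribute equals wanted."""
    passed = [f for f in features if f.get("leisure") == wanted]
    failed = [f for f in features if f.get("leisure") != wanted]
    return passed, failed


# Hypothetical leisure records.
leisure = [
    {"name": "A", "leisure": "park"},
    {"name": "B", "leisure": "swimming_pool"},
    {"name": "C", "leisure": "pitch"},
]
pools, others = split_by_leisure(leisure)
print(len(pools), len(others))
```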

8) Update Transformer Parameters

Now update the AttributeRenamer so that it renames _distance to PoolDistance instead of ParkDistance. Renaming this attribute will cause the ExpressionEvaluator to turn red.

To fix the ExpressionEvaluator, open the parameters and change @Value(ParkDistance) to @Value(PoolDistance) to take account of the new PoolDistance attribute:

@Value(PoolDistance) + @Value(CrimeValue) - @Value(NoiseZoneScore)

You'll also have to do the same thing for the AttributeKeeper transformer.

Re-run the workspace. Check the log for warnings and errors, then inspect the ExpressionEvaluator cache.

Notice that the walkability scores are suddenly exceedingly large due to the PoolDistance. The new max value is 5,477,800. Something is wrong, but what?

9) Locate Problem

PoolDistance is the source of the problem. There is no related log message to give a clue, and the Feature Count numbers look correct.

Let's inspect the data. Click on the leisure reader, and while holding the Shift key, click on the NeighborFinder. This step will open all the selected caches in Visual Preview.

Note

If you have Toggle Automatic Inspect on Selection disabled, you'll have to right-click on either object and select Inspect Cached Features after selecting them both, or Ctrl-click the cache itself instead of the transformer.

Right-click in the Graphics view, go to Background Map, and select Background map off. Visual Preview shows two specks of data a long distance apart. This result is typical of a mismatch of coordinate systems.

Note

We turn the background map off because otherwise, Visual Preview automatically reprojects data with mismatched coordinate systems. Turning the background map off lets us see these are not using the same coordinate system.

Click on some features and select the Feature Information button. In this window, you will see that the primary data has a coordinate system of UTM83-10, while the leisure data from OSM has a coordinate system of LL84.

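This disparity is why the "nearest" pool appears to be so far from each address: the distance calculation treats degrees of longitude and latitude as if they were meters of easting and northing.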
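A quick calculation shows how large the error becomes. The Python sketch below measures a plain planar distance between an address in projected meters and a pool in degrees. Both coordinate pairs are hypothetical values chosen to be roughly in Vancouver, not values taken from the datasets.

```python
import math

# Hypothetical address in UTM83-10 (easting, northing in meters).
address_utm = (491_000.0, 5_457_000.0)
# Hypothetical pool in LL84 (longitude, latitude in degrees).
pool_ll = (-123.12, 49.28)


def planar_distance(a, b):
    """Straight-line distance that ignores coordinate systems entirely."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# The northing of about 5.4 million dominates the result.
mismatch = planar_distance(address_utm, pool_ll)
print(round(mismatch))
```

Because the degree values are tiny next to the projected coordinates, the result is essentially the address's distance from the origin, which is several million meters.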

10) Fix Coordinate System Problem

The obvious solution is to reproject the pools to the correct coordinate system. So, add a Reprojector transformer to reproject the leisure data before it gets to the NeighborFinder:

Adding a Reprojector

Inspect its parameters and set it up to reproject from LL84 to UTM83-10.
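To give a feel for what this reprojection involves, the following Python sketch converts longitude and latitude to UTM zone 10 north using the standard transverse Mercator series on the GRS80 ellipsoid. It is an approximation for illustration only: it assumes the northern hemisphere, ignores any datum shift between LL84 and NAD83, and is not a description of how the Reprojector is implemented. The function name and test coordinates are hypothetical.

```python
import math

A = 6378137.0              # GRS80 semi-major axis in meters
F = 1 / 298.257222101      # GRS80 flattening
K0 = 0.9996                # UTM scale factor on the central meridian
E2 = F * (2 - F)           # first eccentricity squared
EP2 = E2 / (1 - E2)        # second eccentricity squared


def lonlat_to_utm(lon_deg, lat_deg, zone=10):
    """Approximate UTM easting/northing (northern hemisphere only)."""
    lon0 = math.radians(zone * 6 - 183)  # central meridian, -123 for zone 10
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg)
    n = A / math.sqrt(1 - E2 * math.sin(phi) ** 2)
    t = math.tan(phi) ** 2
    c = EP2 * math.cos(phi) ** 2
    a = (lam - lon0) * math.cos(phi)
    # Meridional arc length from the equator to latitude phi.
    m = A * (
        (1 - E2 / 4 - 3 * E2**2 / 64 - 5 * E2**3 / 256) * phi
        - (3 * E2 / 8 + 3 * E2**2 / 32 + 45 * E2**3 / 1024) * math.sin(2 * phi)
        + (15 * E2**2 / 256 + 45 * E2**3 / 1024) * math.sin(4 * phi)
        - (35 * E2**3 / 3072) * math.sin(6 * phi)
    )
    easting = 500_000 + K0 * n * (
        a
        + (1 - t + c) * a**3 / 6
        + (5 - 18 * t + t**2 + 72 * c - 58 * EP2) * a**5 / 120
    )
    northing = K0 * (
        m
        + n * math.tan(phi) * (
            a**2 / 2
            + (5 - t + 9 * c + 4 * c**2) * a**4 / 24
            + (61 - 58 * t + t**2 + 600 * c - 330 * EP2) * a**6 / 720
        )
    )
    return easting, northing


# Hypothetical pool location in degrees.
print(lonlat_to_utm(-123.12, 49.28))
```

Once the pool coordinates are expressed in the same projected meters as the addresses, a planar nearest-neighbor distance becomes meaningful.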

Re-run the appropriate parts of the workspace. Check the log window and inspect the ExpressionEvaluator cache.

Each address now has a walkability score that accounts for pools instead of parks, with a lower number being better and a higher number worse. The new (correct, rounded) maximum is 4,308.

Congratulations on debugging this workspace.